Transposition Invariant String Matching
Authors
Veli Mäkinen (a), Gonzalo Navarro (b), and Esko Ukkonen (a)
(a) Department of Computer Science, P.O. Box 26 (Teollisuuskatu 23), FIN-00014 University of Helsinki, Finland.
(b) Center for Web Research, Department of Computer Science, University of Chile, Blanco Encalada 2120, Santiago, Chile.
Abstract
Given strings A = a_1 a_2 ... a_m and B = b_1 b_2 ... b_n over an alphabet Σ ⊆ U, where U is some numerical universe closed under addition and subtraction, and a distance function d(A, B) that gives the score of the best (partial) matching of A and B, the transposition invariant distance is min_{t ∈ U} d(A + t, B), where A + t = (a_1 + t)(a_2 + t) ... (a_m + t). We study the problem of computing the transposition invariant distance for various distance (and similarity) functions d, including Hamming distance, longest common subsequence (LCS), Levenshtein distance, and their versions where the exact matching condition is replaced by an approximate one. For all these problems we give algorithms whose time complexities are close to the known upper bounds without transposition invariance, and for some we achieve these upper bounds. In particular, we show how sparse dynamic programming can be used to solve transposition invariant problems, and its connection with multidimensional range-minimum search. As a byproduct, we give improved sparse dynamic programming algorithms to compute LCS and Levenshtein distance.
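To make the definition concrete, the following is a minimal brute-force sketch in Python, assuming integer sequences. It is not the sparse dynamic programming approach described in the abstract; it only illustrates what "minimizing over all transpositions t" means. The function names (`ti_hamming`, `lcs_length`, `ti_lcs`) are hypothetical and chosen for illustration. The sketch relies on the observation that only transpositions of the form t = b_j - a_i can produce a match, so no other values of t need to be tried.

```python
from collections import Counter


def ti_hamming(a, b):
    """Transposition-invariant Hamming distance of two equal-length sequences.

    Returns (distance, t), where t is a transposition achieving it.
    Position i matches under t exactly when b[i] - a[i] == t, so the best t
    is the most frequent pairwise difference.
    """
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    if not a:
        return 0, 0
    diffs = Counter(y - x for x, y in zip(a, b))
    t, matches = diffs.most_common(1)[0]
    return len(a) - matches, t


def lcs_length(a, b):
    """Length of the longest common subsequence (textbook row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def ti_lcs(a, b):
    """Transposition-invariant LCS length by trying every candidate t.

    Naive illustration only: one full DP per distinct difference b_j - a_i.
    """
    candidates = {y - x for x in a for y in b}
    return max((lcs_length([x + t for x in a], b) for t in candidates), default=0)


if __name__ == "__main__":
    A = [60, 62, 64, 65]
    B = [67, 69, 71, 60]
    print(ti_hamming(A, B))  # (1, 7)
    print(ti_lcs(A, B))      # 3
```

In the example, shifting A by t = 7 aligns its first three symbols with those of B, so one mismatch remains and the transposition-invariant LCS has length 3. Because LCS is a similarity rather than a distance, the sketch maximizes over t instead of minimizing.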